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Bayes-Optimal Joint Channel-and-Data Estimation 
for Massive MIMO with Low-Precision ADCs 

Chao-Kai Wen, Chang-Jen Wang, Shi Jin, Kai-Kit Wong, and Pangan Ting 


Abstract —This paper considers a multiple-input multiple- 
output (MIMO) receiver with very low-precision analog-to- 
digital convertors (ADCs) with the goal of developing massive 
MIMO antenna systems that require minimal cost and power. 
Previous studies demonstrated that the training duration should 
be relatively long to obtain acceptable channel state Information. 
To address this requirement, we adopt a joint channel-and-data 
(JCD) estimation method based on Bayes-optimal Inference. This 
method yields minimal mean square errors with respect to the 
channels and payload data. We develop a Bayes-optimal JCD 
estimator using a recent technique based on approximate message 
passing. We then present an analytical framework to study the 
theoretical performance of the estimator in the large-system limit. 
Simulation results confirm our analytical results, which allow 
the efficient evaluation of the performance of quantized massive 
MIMO systems and provide insights into effective system design. 

Index Terms —Bayes-optimal Inference, joint channel-and-data 
estimation, low-precision ADC, massive MIMO, replica method. 


I. Introduction 

Fifth-generation (5G) mobile communication systems are 
expected to achieve a 1,000-fold increase in capacity, a 10- 
fold increase in spectral and energy efficiencies, and a 25- 
fold increase in average cell throughput [1]. Such significant 
enhancements can be achieved with large-scale multiple-input 
multiple-output (MIMO) antenna systems, which are also 
referred to as “massive MIMO” systems, e.g., [1^]. These 
systems employ hundreds, or even thousands, of antennas at 
base stations (BSs) to serve tens or hundreds of user terminals 
with the same timefrequency resources. As such, array gains 
are expected to grow infinitely with the number of antennas 
at the BSs, in which case the radiated energy efficiency 
increases dramatically and multiuser interference is eliminated 
completely. 

However, the high dimensionality of massive MIMO sys¬ 
tems considerably increases hardware cost and power con¬ 
sumption. In particular, the hardware complexity and power 
consumption of analog-to-digital converters (ADCs) increase 
exponentially with the number of bits per sample [5] and thus 
present a major obstacle. This drawback has motivated the 
use of low-cost low-precision ADCs (e.g., 1-3 bits) for anten- 
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nas, which has resulted in quantized MIMO systems.' Such 
coarse quantization leads to the failure of all communication 
theories, as well as signal processing techniques dedicated to 
high-resolution quantization [6-9]. Some aspects of quantized 
MIMO systems have been studied in the literature on capacity 
analysis [10-12], energy efficiency analysis [13,14], feedback 
codebook design [15], data detection [16-24], and channel 
estimation [18,20,23-26]. 

The current work is focused on data detection and channel 
estimation for quantized MIMO systems. Previous studies on 
these subject mainly assumed perfect channel state informa¬ 
tion (CSI) at the receiver (CSIR) or considered problems in 
channel estimation. The use of coarse quantization greatly 
reduces the number of effective measurements and hinders 
the easier acquisition of CSIR in quantized MIMO systems 
than in unquantized ones. As explained in [18], a one-bit 
quantized MIMO system requires an extremely long training 
sequence (e.g., approximately 50 times the number of users) 
to achieve the same performance as that in a full CSI case. 
This requirement motivates us to consider joint channel-and- 
data (JCD) estimation, in which the estimated payload data are 
utilized to aid channel estimation. A major advantage of JCD 
estimation is that relatively few pilot symbols are required to 
achieve equivalent channel and data estimation performances 
[27,28]. 

Although an improved performance with the JCD technique 
is expected, its performance in quantized MIMO systems is 
not clearly understood.^ The most related work appears to 
be that in [20], which investigated the achievable through¬ 
put in a one-bit quantized single-input single-output (SISO) 
channel using JCD estimation (i.e., least squares channel 
estimation jointly on pilot and data symbols). For the one- 
bit quantized MIMO system in [20], the authors considered 
a pilot-only scheme with least-squares channel estimation, 
followed by data detection that utilizes the maximal-ratio 
combining. Although high-order constellation, such as 16- 
QAM, was found to be capable of being supported by the 
one-bit quantized MIMO system, which outperforms the ones 
reported in [18] for QPSK, the long training sequence is still 
a requirement. Hence, the fundamental performance limits on 
quantized MIMO systems imposed by the JCD estimation 
represents an interesting research topic. 

’ADCs with a typical precision of 8-12 bits are used in modern commu¬ 
nication systems to process received signals in the digital domain. In this 
paper, the “quantized” MIMO system specifically represents a MIMO system 
equipped with very low-precision ADCs (e.g., 1-3 bits). 

^In the context of an unquantized MIMO system, several aspects of the 
JCD estimation have been widely studied, see e.g., [27,28]. 
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In the present work, we propose a framework for analyzing 
the achievable performance of quantized MIMO systems with 
JCD estimation. Unlike other JCD estimation schemes based 
on suboptimal criteria [20,27,28], the Bayes-optimal inference 
for JCD estimation is used in this work because this approach 
generates minimum mean square errors (MMSEs) with respect 
to (w.r.t.) the channels and data symbols. In the conference 
version of this work [29], our simulation results indicate 
that the Bayes-optimal JCD estimator exhibits a signihcant 
advantage over pilot-only schemes in quantized MIMO sys¬ 
tems. In addition to the derivations omitted in [29], the main 
contributions of this work are summarized as follows. 

• To implement the Bayes-optimal JCD estimator, we use 
a variant of belief propagation (BP) in approximating 
the marginal distributions of each data and channel com¬ 
ponent. We modify the bilinear generalized approximate 
message passing (BiG-AMP) algorithm in [30] and adapt 
it to the quantized MIMO system by providing the 
corresponding closed-form expressions for the nonlinear 
steps. We refer to this scheme as the GAMP-based JCD 
algorithm.^ 

• By performing a large-system analysis based on the 
replica method from statistical physics, we show the 
decoupling principle for the Bayes-optimal JCD esti¬ 
mator. That is, in the large-system regime, the input 
output relationship of a quantized MIMO system using 
the Bayes-optimal JCD estimator is decoupled into a 
bank of scalar additive white Gaussian noise (AWGN) 
channels w.r.t. the data symbols and channel response. 
This decoupling property allows the characterization of 
several system performances of interest in an intuitive 
manner. In particular, the average symbol error rate (SER) 
w.r.t. the data symbols and the average MSE w.r.t. the 
channel estimate for the Bayes-optimal JCD estimator are 
determined. 

• Einally, computer simulations are performed to verify the 
efficiency of the proposed GAMP-based JCD algorithm 
and the accuracy of our analysis. The high accuracy of 
our results ensures the quick and efficient evaluation of 
the performances of quantized MIMO systems. Several 
useful observations related to system design are derived 
from the analysis. 

Notations —Throughout, for any matrix A, Aij refers to the 
(i, j)th entry of A, A^ denotes the transpose of A, is the 
conjugate transpose of A, and tr(A) denotes its trace. Also, I 
denotes the identity matrix, 0 is the zero matrix, || • ||f denotes 
the Erobenius norm, E[-] represents the expectation operator, 
log(-) is the natural logarithm, and sign(-) is the signum 
function. In addition, a random vector z drawn from the proper 
complex Gaussian distribution of mean p and covariance ft 
is described by the probability density function: 

AAc(z;/r,f^) = 

det(7rS2) 

where det(*) returns the determinant. We write z ^ 

^In this paper, the Bayes-optimal JCD estimator is regarded as the theoret¬ 
ical optimal estimator, whereas the GAMP-based JCD algorithm is regarded 
as a practical method for approximating the theoretical optimal estimator. 
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Fig. 1. The quantized MIMO antenna system. 


A/c(z;/X, fi). With Dz denoting the real (or complex) Gaus¬ 
sian integration measure, for an n x 1 real valued vector z, 
we have 

^2 

n z 

Dz = TT^(zj)dzi with = —=; 

i=i v2^ 

or Dz = nr=i ~— ^ ^ dRe(zi)dIm(zi) for the 

complex valued vector, where Re( ) and Im( ) extracts the 
real and imaginary components, respectively. Einally, 

If® 2 

/ e-Vdf. 

denotes the cumulative Gaussian distribution function [31]. 


II. System Model 


We consider a MIMO uplink system in which a BS is 
equipped with N receive antennas serving K single-antenna 
users. The channel is assumed to be flat block fading, wherein 
the channel remains constant over T consecutive symbol in¬ 
tervals (i.e., a block). The received signal Y = \Ynt] G 
over the block interval can be written in matrix form as 


Y = 


1 


zHX + W = Z + W, 


( 1 ) 


y/K 

where X = [Xkn\ € denotes the transmit symbols 

in the block, H = [Hnk] € denotes the channel 

matrix containing the fading coefficients between the transmit 
antennas and the receive antennas, W = \Wnt] S 
represents the additive temporally and spatially white Gaussian 
noise with zero mean and element-wise variance cr^, and we 
dehne Z = [Z„t] = ^HX e 

On the receiver side, each received signal is down-converted 
into analog baseband and then discretized using a 
complex-valued quantizer Qc, as illustrated in Eigure 1. Here, 
each complex-valued quantizer Qc(-) is dehned as Y^t = 
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Qc{Ynt) = Q(Re{y„t}) + jQ(Im{y„t}), i.e., the real and 
imaginary parts are quantized separately. In practice, a variable 
gain amplifier (VGA) with an automatic gain control (AGC) is 
used before the quantization to ensure that the analog baseband 
is within a proper range, e.g., (—1, +1). Analog baseband Ynt 
is assumed to include the AGC gain and is thus in a proper 
range. The resulting quantized signal Y = [Ynt] G is 

therefore given by 

Y = Q,(Y) = Qe(Z + W), (2) 


where the quantization is applied element-wise. 

Specifically, each complex-valued quantizer Qc consists of 
two real-valued B-bit quantizers Q. Each real-valued quantizer 
maps a real-valued input to one of the 2® bins, which are char¬ 
acterized by the set of 2® — 1 thresholds [ri,r 2 ,... ,r 2 B_i], 
such that — oo < ri < r 2 < • • • < r 2 B_i < oo. For notational 
convenience, we define tq = —oo and r 2 B = oo. The output is 
assigned a value in rt] when the quantizer input falls in 

the interval rj,] (namely, the 6-th bin). For example, the 

threshold of a typical uniform quantizer with the quantization 
step size A is given by 


n= {-2^-^+b)A, for 6=1,...,2*5-1, (3) 

and the quantization output is assigned the value ^ when 
the input falls in the 6-th bin."* Figure 1 shows an example of 
the 3-bit uniform quantizer. Notice that in practice, the VGA 
gain can be adjusted to attain the desired step size A. 

The channel matrix H needs to be estimated at the receiver; 
thus, the first Tt symbols of the block of T symbols serve as 
the pilot sequences. The remaining Td = T — Tt symbols are 
used for data transmissions. The training and data phases are 
referred to as t-phase and d-phase, respectively. This setting 
is equivalent to partitioning X and Y as 


X = 


Xt Xd 


with Xt € 


^KxTt 


, Xd e 


"KxT, 


(4a) 


Similarly, we assume that each entry is drawn from 
a complex Gaussian distribution A/c(0,(t^), where is the 
large-scale fading coefficient. If PniHnk) = then 

N K 

Ph(H) = J] J]PH(iT„fc). (6) 

n—1k—1 

To prevent the key features of our results from being obfus¬ 
cated by complex notations, we consider the case in which 
all the users have the same large-scale fading factor in the 
main text. A generalized version of our main result, in which 
the users have different large-scale fading factors, is presented 
in Appendix E. This generalization can be easily achieved by 
plugging user index k into cr^. 

III. Bayes-Optimal JCD Estimation 

We consider the case in which the receiver knows the 
distributions of H and X but not their realizations. In the 
conventional pilot-only scheme, the receiver first uses Yt and 
Xt to generate an estimate of H and then uses the estimated 
channel for estimating data Xd from Yd [18]. In contrast to the 
pilot-only scheme, we consider JCD estimation, where the BS 
estimates both H and Xd from Y given Xt. We will treat the 
problem under the framework of Bayesian inference, which 
provides a foundation for achieving the best MSE estimates 
[32]. 


A. Theoretical Foundation 


We define the likelihood, i.e., the distribution of the received 
signals under (2) conditional on the unknown parameters, as 


N T 

P out (Y|H,X) = nn P out (yYnt 

n—1 t—1 



(7) 


Y = 


Yt Yd 


with Yt e Yd e (4b) 


We assume that Xt (or Xd) is composed of independent 
and identically distributed (i.i.d.) random variables Xt (or Xd) 
drawn from the known probability distribution Px, (or Pxj), 


I.e., 


K Tt 


Px(x)= nnpx.(^t. 


, k=l t=l 



K Td N 

nijpx. (Xd,fet) 


k=l t=l 


= Px,(X,) 


where 
P out 


(?|z) 


y 


^ rVb (^-Re(Z)j-= > 

— / e ^ dy 


1 rrt' (y-Im(Z))^ \ 

X I .—^ / e ^ dy (8) 




■'w ^ 


when Re(Y) € {rb-i,rh] and Im(Y) € Tf,/]. Based on 

the cumulative Gaussian distribution function (see the defini¬ 
tion in Notations), (8) becomes 


(5) 

Given that the pilot and data symbols should appear on con¬ 
stellation points uniformly, the ensemble averages of {Xt^kt} 
and {Xd^kt} are assumed to be zero. In addition, we let 
and be the transmit powers in the t-phase and d-phase, 
respectively, i.e., E{|Xt,fctp} = and E{|Xd,fctp} = 

For ease of notation, we refer an entry of X to Xkt instead 
of Xt^kt or Xtikf Therefore, we use Tt = {I,---,?]:} and 
7d = {7t + l,---)7’} to denote the sets of symbol indices in 
the t-phase and d-phase, respectively. 

■^This output assignment is only trae for 6 = 1,..., 2® — 1. If fe = 2®, the 
quantization output is assigned the value (2®“^ — 2“^) A. 


where 

d/b(x) = 


(y|z) =T-b(Re(Z))T'b,(lm(Z)), (9) 

( 10 ) 

\ J \ J 


The prior distributions of X and H are given by (5) and (6), 
respectively. The posterior probability can then be computed 
according to Bayes’ rule as 

( 11 ) 

P(Y) 



where 


P(Y)= f [ P(Y|H,X)PH(H)Px(X)dHdX (12) 

Jh dx 

is the marginal likelihood. 

Given the posterior probability, an estimator for Hnk can 
be obtained by the posterior mean 


Hnk = J H„k^{H„k)dHnk, (13) 

where 

^{Hnk)= [ dH /dXP(H,X|Y) 

denotes the marginal posterior probability of Hnk- In (13), 
the notation denotes the integration over all the 

variables in H except for Hnk- Similarly, the estimator for 
Xo fct for o G {t,d} can be obtained by the posterior mean 

Kkt = J ^{Xo,kt)XoMdXo,kt, ( 14 ) 

where 

^iXoM)= / dH / dXP(H,X|Y) 

Ju JX\X„,fct 

is the marginal posterior probability of Xg^kt- The notation 
/x\x denotes the integration over all the variables in 

X except for Xo^kt- The posterior mean estimators (13) and 
(14) minimize the (Bayesian) MSB [32] defined as 

mse(H) = ^E{|lH-H|l2}, (15a) 

mse(X<,) = ^E{|lXo-X„|l2}, foroG{t,d}, (15b) 

where the expectation operator is w.r.t. P(Y,H,Xo); more¬ 
over, H = [Hnk] and Xq = [Xo kt]- We refer to (13) and (14) 
as the Bayes-optimal estimator. 

Remark 1: Given a known pilot matrix, X^, which by defi¬ 
nition is given by Px, (Xt) = i5(Xt — Xt)’ obtain X^^kt = 
X.tkt from (14); therefore, mse(Xt) = 0. For the case of 
interest, we always have mse(Xt) = 0. The algorithm as 
well as the analytical results still work even if the pilots are 
unknown, and the MSB can be expressed as (15b). 


B. Bayes-Optimal Estimator in SISO Channel 

To better understand the Bayes-optimal estimator, we first 
explain it in a simple SISO system. We consider a SISO 
version of the system (1) given by 

Y = Z + W. (16) 


Recall that W is the additive white Gaussian noise with zero 
mean and variance cr^. After a complex-valued quantizer, Y = 
Qc(Y) is obtained. Based on the system model (1), Z = HX 
should be kept. However, to facilitate interpretation, we first 
let Z be a random variable with distribution Pz- According to 
Bayes’ rule (11), the posterior probability can be computed as 


P{Z\Y) 


Pout{Y\Z)Pz{Z) 

P{Y) 


(17) 


where P(Y) = / Pout(Y|z)Pz(^) dz is the marginal likeli¬ 
hood. Then, from (13) or (14), the posterior mean estimator 
for Z is given by 


Z = 


zP{z\Y) dz. 


(18) 


To specify the estimator, we further assume that Z is a 
proper complex Gaussian with mean p and variance i.e., 
Pz(Z) = Afc{Z;p,v^). Then, we derive the estimator (18) 
under the two channels, unquantized and quantized, in the 
following examples. 

Example 1 (Unquantized Channel). Jn this case, we have 
Y = Y and Pout(Y|Z) = . Using these 

distributions, we obtain 


Pout(Y|Z)Pz(Z) = Mc{Z- Y, al)Mc{Z-p, v^) 
yPY -f alp alvP \ 


= D-Nc\ Z: 


at, -P vP at, -P vP 


(19) 


where D = A/c(0; Y — p,al -\- yP), and the second equality 
follows the Gaussian reproduction property [33, (A.7)].^ Sub¬ 
stituting (19) into (17), we obtain 


P{Z\Y)=Nc 



yPY -P alp 
al + yP 


al+yP j' 


( 20 ) 


The estimator (18), which is the mean of P(Z|Y) after 
rearranging is determined as 

Z = p+^^{Y-p). (21) 

al+yP 


The MSB of the estimator, which is the variance of P(Z|Y), 
is 


(yP)'^ 

al+yP' 


( 22 ) 


Example 2 (Quantized Channel). If Re(Y) G (rh_i,rt,] and 
Im(Y) G (rk'-i,rb>], then the likelihood of the quantized mea¬ 
surement Y is given by (9). The calculation of the posterior 
mean and variance in the quantized channel is technical, but it 
basically follows a procedure similar to that in the unquantized 
channel. A derivation is given in Appendix A, which turns out 
to yield 


_ ^ sign(Y)uP / (j){r]i) - \ 

^ ^2(0-2 -PvP) - d>iV2) J ' 

yP (yP)^ 

2 2(al -P yP) ^ 

( vMm) - 72(/>(??2) / 0 ( 71 ) - Hm) 
i $( 71 ) - 4 -( 72 ) V 4 >( 7 i) - 4 >( 72 ) 


(23) 



, (24) 


^The product of two Gaussians gives another Gaussian [33, (A.7)]: 

Nc{x-, a, A)Mc{x', b, B) = D - Mc{x', c, C), 

where c = a +B-^h), C = (A-l +5-1)-!, and D = Ac(0; a- 

b,A + B). 
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where 


sign(y)p- min{|r{,_i|, |rb|} 
^{al+vP)/2 

(25a) 

sign(y)p- max{|r{,_i|, |rh|} 
^ial+vP)/2 

(25b) 


The real and imaginary parts are quantized separately, and each 
complex-valued channel can be decoupled into two real-valued 
channels. The expressions (23) and (24) are the estimators 
only for the real part of Z. To facilitate notation, we have 
abused Y and Z in (23) and (24) to denote Re(y) and Re(2’), 
respectively. The estimator for the imaginary part lm{Z) can 
be obtained analogously as (23) and (24), while Y and b should 
be replaced by Im(y) and b', respectively. 

Remark 2: Recall ro = —oo and r 2 B = oo. Therefore, if 
b = 1 or b = 2^, we obtain (^( 772 ) = 0, r] 2 (j){r] 2 ) = 0, and 
$( 772 ) = 0. Additionally, for the special case of B = 1 (i.e., 
one-bit quantizer), the expressions of (23) and (24) agree with 
those reported in [34]. 

Remark 3: For another extreme case of B —> 00 and 
A —0, we return to the unquantized channel in Example 
1. Instead of using the procedure in Example 1, we show 
how the expressions (21) and (22) can be obtained from (23) 
and (24). Recall that rt,_i and are the upper and lower 
bin boundary positions w.r.t. the 6 -th bin, respectively. Let 
rb-i = r and rj, = Vb-i + dr. As B —)■ c» and A —> 0, 
we obtain dr —> 0, which results in rj, —>■ r and 771 —>■ 
772 — 77. Eurthermore, we obtain $(771) — $(772) —)■ ^$(77), 
ay</'( 77 ), and 

By substituting these relationships into (23)-(24) and applying 
the facts that ^$( 7 ?) = (/)(77)^?7, ^^( 77 ) = -77(/>(77)^77, and 
^'ri(j){r]) = {l — ri'^)(j){r])-^r], we recover the same expressions 
as those given in (21) and (22) for the real part of Z. The 
imaginary part for Z can be obtained analogously. 

The aforementioned example is the estimator for Z. The 
same concept can be easily applied to the estimate of H or 
A, if Z is replaced by iF or A in (16). However, if Z = HX 
and both H and A are unknown, the complexity of the Bayes- 
optimal estimator increases. In this case, the posterior proba¬ 
bility in (17) becomes P{H,X\Y) = 

which involves two prior distributions for H and A as that 
in (11). To implement the posterior mean estimator for H 
and A, we need the marginal posterior probabilities ^{H) = 
J P{H,X\Y)dX and ^{X) = JP{H,X\Y)dH, resp^- 
tively. A closed form for the posterior probability P{H, X\Y) 
does not appear possible. Although we can resort to numerical 
integration to implement the estimator, the computational com¬ 
plexity is high. Therefore, one might consider an alternative 
technique; that is, the estimate of H is performed with fixed 
A and vice versa. 

In the next subsection, we develop a practical algorithm for 
the Bayes-optimal estimator in the quantized MIMO system. 
Before proceeding, we intend to provide an intuition on the 
algorithm. A representation of the algorithm is shown in Eig. 2, 
which seems to operate in the alternative manner. Conceptu¬ 
ally, when the posterior mean and variance of Z are obtained 
from the quantized observation Y, we can reconstruct Y and 



Fig. 2. A representation of the GAMP-based JCD algorithm. 



Ph(7?32) 


Channel Variables 

Fig. 3. Factor graph representation of the integrant of (26), where TV = 3, 
iV = 2, and T = 2. 


then approximate PoutiY\Z) as a Gaussian distribution. Then, 
the posterior mean estimator for H (or A) can be conducted 
through Y, which is an AWGN channel rather than a quantized 
channel, in an alternative manner. This representation is merely 
for an intuition. The accurate algorithm development takes a 
different route. 


B. GAMP-Based JCD Algorithm 

Erom the discussions above, the direct computations of 
(13) and (14) are intractable because of the high-dimensional 
integrals in the marginal posteriors ^{Xkt) and ^{Hnk)- To 
make these tractable, we first note that by combining (5)-(7), 
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the posterior probability (11) can be factored into 


1 

p(T) 


N T 


N K 


nn Pout I^ n n 


n—l t—1 

K Tt 


n—lk—1 
K Td 


X IIII Pxt(^t.fei) X PXd(^d,fct)- 

fc=lt=l fe=lt=l 


(26) 


An example factor graph for (26) is shown in Figure 3, where 
a square represents a factor node associated with the sub¬ 
constraint function Pout{Ynt\Znt) in (26), whereas a circle 
shows a variable node associated with Hnk, or Xt^kt- 

The factor graph suggests the use of the canonical sum-product 
algorithm to compute the marginal posterior probabilities. The 
algorithm uses a set of message-passing equations that go from 
factor nodes to variable nodes and vice versa. 

However, the computational complexity of the sum-product 
algorithm remains infeasible in the case of interest because 
it still involves high-dimensional integration and summation. 
Thus, we resort to recently developed approximation algo¬ 
rithms: the so-called AMP (approximate message-passing) 
algorithm [35] and the generalized AMP (GAMP) algorithm 
[36]. In particular, the AMP algorithm, which is a variant of 
the sum-product algorithm, was initially proposed by Donoho 
et al. [35] to solve a linear inverse problem in the context 
of compressive sensing. The use of GAMP in our MIMO 
system means that given H is known; thus, GAMP can provide 
a tractable method to approximate the marginal posteriors 
3^{Xkty^- This part corresponds to addressing the message¬ 
passing equations between Y^t and Xkt, i e., the left-hand side 
of Figure 3. For the study, see [19,37]. More recently, Parker 
et al. in [30] applied a similar GAMP strategy, referred to 
as BiG-AMP, to the problem of reconstructing matrices from 
bilinear noisy observations (i.e., reconstructing H and X from 
Y). 

BiG-AMP for JCD estimation is presented in Algorithm 1 
for a given instantiation of the quantized observations Y, the 
pilot matrix Xt, as well as the likelihood Pout(Y|Z), and the 
variable distributions Ph(H) and Pxj(X(j). We refer to this 
scheme as the GAMP-based JCD algorithm, which follows 
the same structure as BiG-AMP [30] except for the steps in 
dealing with the known pilots, i.e., t G % in Algorithm 1. 
Reference [30] presents the derivation details of BiG-AMP. 

To better understand the algorithm, we provide some in¬ 
tuition on each step of Algorithm 1. Fig. 2 illustrates the 
structure of the algorithm. In lines 3-6, an estimate Pd = [Pnt] 
of the matrix product Zd = HXd and the corresponding 
variances {vf^^ : t G Td} are estimated. Pd = [pnt] and : 
f € 7d} in lines 3 and 4 can be regarded as auxiliary variables.® 
The same procedure is followed in lines 1 and 2 but for the 
matrix product Zt = HXt. Given that the pilot matrix Xt 
is known, the corresponding variances for Xt are zero, i.e., 
= 0 for f G 7t- With = 0, we thus have plugged pnt 
and into pnt and for f G 7t to obtain lines 1 and 2. 
The posterior means Z = [Znt] and variances of Z 


®Pd is a plug-in estimate of Z^, whereas Pj = \pnt\ provides a refinement 
by introducing the “Onsager” correction into the context of AMP. For details, 
see [30]. 


Algorithm 1: GAMP-based JCD Algorithm 

input : Quantized observations Y, pilot matrix Xt, likelihood 

PoLit{Y|^), and variable distributions and PXd(^d) 

output : H, Xd 

definition: Ek = ELl E„ = E^.i 

initialize : ^ 1; Vn, t: Snt(O) = 0, v^t(O) = 1, Z„t(0) = 0; 

\fn,k,t: = 1, Xd,fet(l) = 0, n[[j,(l) = 1, 

HnkT) = 0. 


while 




> e and ^ 


do 


E„,t |z^t(e-i)p 

if t S 7t then 

Vn: vl,{i) = Y.k<mXkt?-, 

Vn: PntiO = T,k HnkT)Xkt - Snt{t - l)v[]t(G; 


3 

4 

5 

6 

7 


9 

10 

11 

12 


13 

14 


if f G Td then 

Vn: vlPi) = Efe \Hnk(i)?vUi)+v}^^{0\Xkt{0?-, 
Vn:p„AG = Efcj?nfc(0^fet(G; 

Vn: vlPi) =tl^t(?)-l-Efct^nfe(?)’'«(0; 

Vn: PntiO = Pnt(f) - Snt(i - lXt(G; 

Vn,i: <i(G = Var{Z„t|p„t(0><tK)}; 

Vn,i: Z„t(G = E{Z„t|p„AG,<t(0}; 

Vn,f: <,(G = (1 

Vn,f: SntiO = (ZntiO - PntiO)/v^P(); 

Vk,t: vi,iO = [E„l^nfe(GP<t(0]“'; 

Vk,t: f«(f) = XktiO {l-vltiOJLn^kTXtiO) 

Vn,fc: = [Et67t ; 

Vn, k: q„kiO = -^nfc(0 (l - T.teT, 

+''^nfe(?)(Et6T; X*_i.Snt{i) 

+ EtGT^j-^fct(€)'^rit(?)] ; 


15 

16 

17 

18 


Vfc,t STd: + =Var{Xfct|fM(G,n«(0}; 

Vfc,t STd: Xfet(^ + 1) = 

Vn,fc: n®j,(^ + l) = Var{iV„j, |g„i,(^), (G}; 

Vn,fc: iV„fc(5 + l) = E{H„k\qnk{0,yLX}-, 




are obtained in lines 7 and 8 using {pnt,Vnt}- Subsequently, 
the posterior moments are used in lines 9 and 10 to compute 
the residual S = [s„(] and the inverse residual variances 
In lines 11 and 12, these residual terms are used to 
compute R = [fkt] and Ktl’ where rkt can be interpreted as 
an observation of Xd.kt under an AWGN channel with zero 
mean and a variance of Similarly, Q = [qnk] and 
where qnt can be interpreted as an observation of Hnk under 
an AWGN channel with noise variance of are evaluated 
in lines 13 and 14. Finally, the posterior mean X = [Xkt] and 
variances are estimated in lines 15 and 16 by taking 

into account the prior Px^; the same is performed for Hnk in 
lines 17 and 18. 
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C. Nonlinear Steps 

Algorithm 1 provides a high-level description of BiG-AMP 
to perform JCD estimation. Lines 7-8, 15-16, and 17-18 
of Algorithm 1 perform the posterior mean and variance 
estimators for Znt, X^t, and Hnk, respectively. A remarkable 
feature of the algorithm is that at each iteration, the estimates 
of Znt, Xkt, and Hnk can separately serve as the estimators 
over a bank of scalar channels. Next, we describe these 
nonlinear steps in detail. For brevity, we omit the subscript 
indexes n,k,t hereafter. 

First, we notice that lines 7-8 compute the posterior mean 
and variance of Z; in this computation, the expectation oper¬ 
ator is w.r.t. 


i^{Z) 


P out {Y\Z)Pz{Z) 
!PontiY\z')Pz{z')dz'' 


where Po\it{Y\Z) is given by (9), and Pz{Z) = Afc{Z-,p,vP). 
This process is identical to that implemented in Example 2. As 
a result, lines 7-8 of Algorithm 1 for each real-valued channel 
can be computed using the expressions in (23)-(24). 

Next, we discuss the nonlinear steps used to compute 
{X,v^) and {H,v^) in lines 15-16 and 17-18 of Algorithm 
1. Specifically, the expectations and variances in lines 15-16 
and 17-18 are taken w.r.t. the marginal posterior 


^(^d) 


Mc{X^-,r,v-)P^^{X^) 

/ f, 7;’')Px,(x'J ’ 

Nc{H-A,v^)Ph{H) 

J Mc{h'-A,v^)Pu{h')dh'- 


(27) 

(28) 


These posterior probabilities are similar to those of {Z,v^), 
except that Po^t is replaced with a Gaussian distribution and 
the corresponding priors Pxj and Ph are used in place of 
Pz- In fact, the former change results in an estimator that is 
fundamentally different from that in the case of Z. In Example 
1, if Pout is a Gaussian distribution, then the estimator is 
operated in an unquantized channel. That is, the estimates of 
H and X in Algorithm 1 are based on AWGN channels. 

To specify {X,v^), we consider the square QAM constel¬ 
lation with 2iy X 2^ points 


<T = {Xr + jXi : Xr, Xi G {-i2iz - 1)C,..., -3C, -C, 
C,3C,...,(277-1)C}}, (29) 


where ^ = 1/,y2{{2i')'^ — l)/3 is the power normalization 
factor. If X is drawn from the constellation points uniformly, 
i.e., Pxj(2fd) = l/(2t/)^ for X^ G X, then lines 15-16 of 
Algorithm 1 can be computed using 


^ . Er^i(2»-l)CJ^f(Re(f)) 

' EUFnMr)) 

, ■ E»=i(2»- l)C-PT(Im(f)) 

EUF^Mr)) 

, ELi(2»-l)\^ff(Im(f)) ^ 

EUiFnimif)) ' ‘ 


(30) 


( 31 ) 


where 


Fiix) = e sinh 


(2i-l)^C 

F^{x) = e cosh 


2(2i- 1)C^ 
2{2i-l)C 


Einally, recall that PniFlnk) =-^c(0, cr^). Then lines 17-18 
of Algorithm 1 can be computed using 


H = 


at +v‘i 


q and — 


q\2 


alXv<i' 


(32) 


The derivation of (32) is identical to that in Example 2. 

Using the above nonlinear steps (23)-(24) and (30)-(32), 
we implement the GAMP-based JCD algorithm based on the 
open-source “GAMPmatlab” software suite. 


IV. Performance Analysis 

In this section, we present a framework to analyze the 
Bayes-optimal JCD estimator. The key strategy for analyzing 
mse(H) and mse(Xd) is through the average free entropy 

J-4^EY{logP(Y)}, (33) 

where P(Y) denotes the marginal likelihood in (12), that is, 
the partition function. Aligned with the argument in [38, 39], 
mse(Xd) and mse(H) are saddle points of the average free 
entropy. Thus, the goal is reduced to finding (33). 

The analysis is based on a large-system limit, that is, when 
N,K,T —> oo but the ratios 

NlK = a, T/K = P, T,/K = (3t, T^IK = (34) 

are fixed and finite. Eor convenience, we simply use K —t oo 
to denote this large-system limit. Even in the large-system 
limit, the computation of (33) is difficult. The major difficulty 
in computing (33) is the expectation of the logarithm of P(Y), 
which, nevertheless, can be facilitated by rewriting F as^ 

(35) 

The expectation operator is then transformed inside the log- 
function. We first evaluate Ey{P^(Y)} for an integer-valued 
T, and then generalize the result to any positive real number r. 
This technique, called the replica method, is from the field of 
statistical physics [40], which is not mathematically rigorous. 
Nevertheless, the replica method has proved successful in 
a number of highly difficult problems in statistical physics 
[40] and information theory [27,41^7]. Some of the results 
originally obtained by the replica method have been subse¬ 
quently validated by other approaches (e.g., [48,49]). Under 
the assumption of AT — > (X) and replica symmetry (RS), an 
asymptotic free entropy can be obtained later in Proposition 
1. We check the accuracy of the replica-based analysis via 
simulations. Proposition 1 involves several new parameters. 

^We use the following formula from right to left 

lim — logE{A^} = lim = E{logA}, 

where A is any positive random variable. 
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A. Parameters of Proposition 1 

Most parameters (except for some auxiliary parameters) of 
Proposition 1 can be illustrated systematically by the scalar 
AWGN channels: 


Yx, = s/^x^,X^ + Wx,, (36a) 

Yh = ^H + Wh, (36b) 

where Wh,Wx, ~ Arc(0,1), H ~ P^, and Xd ~ Px,- 
We shall specify how the parameters qh and qx^ are related 
to the asymptotic free entropy later in Proposition 1. Thus 
far, we know that the parameters qx^ and qn serve as the 
signal-to-noise ratios (SNRs) of the above AWGN channels. 
The likelihoods under (36a) and (36b) are respectively given 
by 


P(yxJ^d) = 

TT 

P{Yh\H) = 

TT 

and then we get the posteriors 


P(XdlYxJ 

P(ffiYH) 


PxAXd)P(YxjX,) 

fdx'^ Px,(x^)P(YxJa;^)’ 

PH(ff)P(Yfflff) 

fdh'PH(h')P(YHlh')' 


(37a) 

(37b) 


(38a) 

(38b) 


Some important quantities are obtained with the posteriors. 
For example, the posterior mean estimators for Xd and H read 


Xd= JdX^XdP{Xd\Yx,), (39a) 

H = JdH HP{H\Yh). (39b) 

The MSEs of the estimators are thus given by 

msex, =E{|Xd-Xd|2}, (40a) 

mse^ = E||iJ-i7|2| , (40b) 

where the expectations are taken over P{Yxj,,Xf) and 
P{Yh,H), respectively. Equations (30)-(32) are explicit ex¬ 
pressions of the above quantities. In addition, the mutual 
information between Yxj and Xj reads [50] 

IiX,;YxMxJ = -Ey,^{logEx, }}-l, 

(41) 

and the mutual information between Yjj and H is 

I{H-,YH\qH) = -Ey^llogEff - 1. 

(42) 

Erom (36), an inference that another scalar AWGN channel 
w.r.t. the t-phase exists can be made, i.e., 

+ (43) 

where Wjc, ~ A/c(0,1) and X^ ^ Px,. As the pilot is known, 
msex, = 0 can be obtained easily following the argument in 
Remark 1; and the mutual information between Yx, and 2ft is 
0. As all the performances relating to (43) are trivial, we will 
not use (43) in the following discussions. 


B. Analytical Results 

Proposition 1: As AT —> oo, the asymptotic free entropy is 

( 2® 

DuT',(Eo)log^6(Eo) 

- aI{H]YH\qH) - /3d7(Xd; Yxj|<7Xd) 

+ ol{ch — qH)qH + ^ Po{cxo — qxjqx^, (44) 

oG{t, d} 

where 


4'h(Vo) = $ 


V2rb - K 


CHCXa — qnqXa J 

V2rb-i - Vo 


_$ 


sj + chcx„ — qnqxa 


(45) 


Vo = y/qHqXaV for o e {t, d}; /(•)’s are given by (41) and 
(42); and cx„ = E{|2f„|2} = ch = E{|i7|2} = al 
In (44), the other parameters {qx„, qn, 9Xo, 9rr} are obtained 
from the solutions to the fixed-point equations 


qH = PtqXtXtY Pdqx.Xd, qH = CH-mseH, (46a) 

qxt = ctqnXu qxt = Cjf, - msex,, (46b) 

qx, = aqnXd, qxi = cxd - msexj, (46c) 


where msex, = 0, and msen and msexj are given by (40). 
In (46), we have defined 


= E 

6=1 



{s/qMv:v )) 

^6 (s/qHqx^v) 


for o e {t, d} 


(47) 


with 'l'h(-) given by (45) and 


tXo) 

(V2r-g-VoP 


(V^^b-l-VaP 


, -iHVXa'l — 


j-lHiXo') 


^ 27 r((j 2 -1- chCXo - qnqxS) 


(48) 


Proof: See Appendix B. ■ 

As previously mentioned, the asymptotic MSEs of 
and H are the saddle points of the free entropy. Clearly, 
from Proposition 1, they are msexj and mse//, respectively. 
Notably, the MSEs are associated with the scalar AWGN 
channels (36a) and (36b). Therefore, the performances of the 
quantized MIMO system seem to be fully characterized by 
the scalar AWGN channels (36). The following proposition 
formulates such intuition. 

Proposition 2: Let Xd^^kt, Hnk, Xd^kt, and Hnk denote 
the {k,t)-th and (n, A:)-th entries of Xd, H, Xd, and 
H, respectively. As AT —> oo, the joint distribution of 
{Xd,kt, Hnk, Xd,kt^ Hnk) of channels (2), 03)^ and (14) con¬ 
verges to the joint distribution of (Xd, H, Xd, H) of the scalar 
channels (36a) and (36a). 

Proof: See Appendix C. ■ 

Proposition 2 shows that, in the large-system limit, the input 
output of the quantized MIMO system employing the Bayes- 
optimal JCD estimator is equivalently decoupled into a bank of 
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the scalar AWGN channels (36a) and (36b). This characteristic 
is known as the decoupling principle, which was introduced by 
[43] for treading an unquantized MIMO system with perfect 
CSIR. If perfect CSIR is available, then we will not need 
(36b) for treating the channel estimation quality. Notably, 
Proposition 2 extends the decoupling principle to a general 
setting. In particular, we employ the JCD estimator so that the 
decoupled AWGN channels involve not only the data symbol 
[i.e, (36a)] but also the channel response [i.e., (36b)] as well. 

Remark 4: The equivalent channels (36) are the scalar 
AWGN channels, with gj and qn being the equivalent SNRs. 
As shown in (46) and (47), the quantization effect is included 
in gd and qn through Xo for o G {t, d}. Consider the extreme 
case of B —> oo and A —> 0, i.e., the unquantized channel. 
In this case, the Riemann sums (4^) becomes the 

Riemann integral over the interval (— 00 , 00 ). Applying the 
technique in Remark 3 to (47) and evaluating the integrals, 
Xo can be simplified to 


o’S, + CffCx^ — qnqXa 


Substituting (46) for qn and gx„ in the denominator of (49), 
we obtain (j^+c//msex„ + (cx„ —msex„)mse//. The quantity 
in this form can be understood as the noise plus the residual 
interference resulting from the estimation errors of the data 
symbol and channel response. 

We focus our interest on several special cases in the 
following examples to obtain more insight into Proposition 
2 . 

Example 3 (Constellation-like Inputs). Based on Proposition 
2, the asymptotic MSEs w.r.t. Xd and H can be determined 
by using the MSEs of the scalar AWGN channels (36a) and 
(36b), respectively. Thus, if the data symbol is drawn from a 
quadrature phase-shift keying (QPSK) constellation, then we 
will derive 


msexd J ) 


mse// = 


(50) 

(51) 


1 + (Jf^qn ■ 

The SER w.r.t. Xd can also be evaluated through the scalar 
AWGN channel (36a), which is given by [51, p.269] 

. n 2 


SER = 2Q 


(52) 


where Q{x) = Dz is the Q-function. 

In fact, all these performances w.r.t. Xd can be determined 
on the basis of knowledge of the scalar AWGN channel with 
SNR gx- Thus, if the data symbol is drawn from other square 
QAM constellations, then the corresponding SER can be easily 
obtained by using the closed-form SER expression in [51, 
p.279]. 

Example 4 (Perfect CSIR). If the channel matrix H is 
perfectly known, then the t-phase will not be required so that 


/3t = 0 and /3d = /3. (53) 


Given that H is perfectly known, mse// = 0. Integrating 
this into (46a), we immediately obtain q^ = ch = which 


yields 

qnqxj = CHqxd, (54) 

CHCxa — qnqxa = c/rmsexd, (55) 

in which (55) follows the result of chCx^ — qnqx^ = 
ch{cxj — qxa) (46c). Substituting (53)-(55) into (45), 
(47) and (48), we derive more concise expressions for Xd, 
'l'b(')’ ^b(')- Notably, when particularizing our results 

to the case with the QPSK inputs, we recover the same 
asymptotic MSE expression as given in [16, (7) and (8)]. More 
precisely, in [16], the real-valued system with BPSK signal 
was considered. In such case, in our study should be 

replaced with Vb- 

Example 5 (Pilot-only Scheme). In the conventional pilot- 
only scheme, the receiver solely uses Yt and Xt to generate 
an estimate of H and subsequently uses the estimated channel 
for estimating the data Xd from Yd [18]. The analysis of the 
asymptotic MSE w.r.t. H is the same as that in Example 4, 
but the roles of H and Xt are exchanged. In particular, during 
the t-phase, we derive /3d = 0 and msex, = 0 because no data 
symbol is involved and the pilot matrix Xt is known. After 
substituting these parameters into (46) and simplification, we 
obtain the self-contained fixed-point equations 


mse/r = “ - > 

1 + <jiqH 

(56) 

qH = Ptcrl^Xt 

(57) 


with 


Xt 

In (56), mse// represents the asymptotic MSE w.r.t. H for the 
pilot-only scheme, which is also the MSE w.r.t. H for the 
scalar AWGN channel (36b). Notably, qn serves as the SNR 
of the AWGN channel (36b). Comparing qn in (57) with that 
in (46a), we determine that the second term of qn in (46a) is 
the gain achieved by data-aided channel estimation. 

Before proceeding with the analysis of estimated data in 
the pilot-only scheme, we provide the following proposition 
to obtain a better understanding of mse// in (56). 

Proposition 3: Let the channel gain and the transmit pilot 
power be normalized, that is, = 1 and = 1. In the high- 
SNR regime and /3t = T^/K ^ 1, mse// of the pilot-only 
scheme can be approximately expressed as 

mseH ~ -201ogio(A) + C'b (dB), (59) 

where C'b is a quantizer-dependent (e.g., A and B) constant. 

Proof: See Appendix D. ■ 

As an example. Table I provides the corresponding value of 
Cb for a uniform B-bit quantizer with A = In this 

case, we plot the MSEs that use the approximate expression 
(59) as well as its analytical form (56) in Eigure 4. We 
observe that, for /3t > 2, the approximation (59) matches the 
theoretical result (56) perfectly. Table I shows that the constant 
Cb satisfies Cb « —6.02B -I- 4.4895 in high-resolution cases, 
indicating that mse// decreases by 6dB for each 1 bit increase 




Dv 


(^6 ) 
4-6 - msen 


(58) 
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TABLE I 

Cb for uniform B-bit quantizer with a = 


B 

1 

2 

3 

4 

5 

6 

7 

Cb (in dB) 

2.8731 

-5.9852 

-13.0201 

-19.4804 

-25.7065 

-31.8265 

-37.6547 



Fig. 4. The asymptotic MSE w.r.t. H of the pilot-only scheme versus the 
pilot ratio /3t = Tt/K for different B-bit quantizer. 




Fig. 5. SER versus SNR for QPSK constellations. In the results, the JCD 
estimation scheme is used under the settings with a) perfect CSIR and b) 
no CSIR. Curves denote analytical results and markers denote Monte-Carlo 
simulation results achieved by the GAMP-based JCD algorithm. The MSEs 
w.r.t. H of the JCD estimator are plotted as a subfigure. 


of the rate. Notably, this property coincides with the well- 
known hgure of merit in quantization.^ From (59), given a 
fixed quantizer (i.e., fixed Cb), msejy increases by 6dB for 
each doubling of training length /3t. Consequently, doubling 
the length for training has the same effect as increasing an 
extra bit on every ADC at the massive MIMO receiver. 

The proceeding observation provides a useful guideline for 
the trade-off between the training length and the ADC word 
length. For instance, if we target /3t to that attained by the pilot- 
only scheme at msejy = —30dB, the 4 bit receiver requires 
/3t = 4, as shown in Figure 4. If we intend to reduce the ADC 
word length of each ADC to 1 bit, then the training length 
increases 2^“^ = 8 times compared with that in the 4 bit case. 
This argument shows the importance of the JCD technique in 
the quantized MIMO system. With the JCD technique, the 
estimated payload data are utilized to aid channel estimation 
so that the effective training length virtually increases. 

Then, we return to the analysis of estimated data. If the 
channel estimate is subsequently used to estimate data via the 
Bayes-optimal approach, then we can derive the corresponding 
self-contained fixed-point equations for the d-phase similar to 
(56) and (57). In particular, we calculate (46c) given a fixed 
= a^ — msen, with msejy given in (56). Given that no 
iteration process occurred between the pilots and data symbols, 
(46a) and (46b) are not involved in the d-phase. If the JCD 
technique is employed, then mse// can be further reduced. 
Any reduction in channel estimation error mse/r results in an 
increase in qh', thus, qxj = aqHXd increases. 

*The property of 6dB improvement in signal-to-quantization-noise ratio 
for each extra bit is a well-known figure of merit in the ADC literature [52, 
p.248]. 


V. Discussions and Numerical Results 
A. Accuracy of the Analytical Results 

Computer simulations are conducted to verify the accuracy 
of our analytical results. In particular, we compare the SER 
expression (52) and the analytical MSE w.r.t. H (51) with 
those obtained by the simulations. The simulation results 
are obtained by averaging over 10,000 channel realizations, 
wherein the GAMP-based JCD algorithm (Algorithm 1) is 
implemented with tolerance e = 10“® and the maximum 
number of iterations ^max = 100.® The system parameters are 
set as follows: AT = 50, N = 200, Tt = 50, and = 450. 
The SNR of the system is defined as SNR = l/cr^. The 
pilot matrix Xt S consists of statistically independent 

QPSK constellations. In the simulations, we use the typical 
uniform quantizer with a fixed quantization step size A = 1/2. 
Notably, this quantization step size is nonoptimal. The optimal 
step size will be discussed in the subsequent subsection. As the 
QPSK constellations are used to transmit data, Eigure 5 shows 
the corresponding SER results for the cases of (a) perfect CSIR 
and (b) no CSIR. The corresponding MSE w.r.t. H of the JCD 
estimator are plotted as a subfigure in Eigure 5(b). We observe 
that the GAMP-based JCD algorithm can generally describe 
the performances of the theoretical Bayes-optimal estimator. 
The performance of the theoretical Bayes-optimal estimator 
can also be described by our analytical expressions. Notably, 

®We do not show the convergence of the GAMP-based JCD algorithm 
because of space limitation. In most cases, the GAMP-based JCD algorithm 
converges after only 20-30 iterations although it shows a slow convergence 
at low SNRs. 
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the GAMP-based JCD algorithm is only an approximation 
of the Bayes-optimal JCD estimator whose implementation 
is prohibitive. For the case with no CSIR, the GAMP-based 
JCD algorithm cannot work as well as that predicted by the 
analytical result at low SNRs. At low SNRs, the GAMP- 
based JCD algorithm shows a slow convergence, such that 
the adopted maximum number of iterations is insufficient. 
This gap motivated the search for other improved estimators 
in the future. 

Figure 5(b) shows that performance degradation due to low- 
precision quantization is small. For instance, if we target the 
SNR to that attained by the unquantized system at SER= 
10“^, then the 3 bit Bayes-optimal JCD estimator only incurs 
a loss of 5.61 — 4.32 = 1.29 dB. Even with 2-bit quantization, 
the loss of 7.27 — 4.32 = 2.95 dB remains acceptable. 

B. Optimal Step-Size 

In the 1 bit ADC (i.e., B = 1), the quantization output is 
assigned the value ^ if the input is a positive number and — y 
otherwise. Eor the Bayes-optimal estimator, the performances 
are irrelevant to any particular value of A.^' This property can 
be easily achieved by reviewing the likelihood in (8), wherein 
Tf, = {—c», 0, (X)} for 6 = 0,1, 2. Notably, A is not involved 
at the beginning of data estimation. Therefore, we shall focus 
on cases with B > 1. 

Notably, Ynt is the input signal to the quantizer. Direct 
application of the central limit theorem results in that Ynt 
can be approximated as a Gaussian distribution with variance 
= l-|-cr^. Eor a Gaussian signal with unit variance, 
the optimal step size that can be used to minimize quantization 
distortion is computed in [53] and is 1.008/-\/2 « 0.7128 if 
B = 2}^ Under the same setting, that is, a = NjK = 4, 
/3 = T/K = 10, /3t = Tt/K — 1, Eigure 6 shows the 
SERs of the Bayes-optimal estimator as a function of the 
normalized step size IS./^JE{\Ynt\^} for B = 2 . We observed 
that the step size optimized in terms of the SER for the Bayes- 
optimal estimator is quite different from that for minimizing 
its distortion. 

Eigure 7 shows the optimal step sizes for different input 
signals Xd, including QPSK, 16QAM, 64AM, and Gaussian 
inputs. The optimal step size varies slightly for different input 
signals and decreases with the increase in SNR. We observe 
from other simulations that the optimal step size varies only 
slightly for different settings of a and jS. Thus, we conclude 
that the optimal step size for the Bayes-optimal estimator is 
mainly dominated by the SNR. 

We ht the optimal step sizes for different input signals by 
using a hrst-degree polynomial equation 

Aopt(snrdB) = oo + oisnrdB, (60) 

where sntdB represents the SNR in dB scale to derive a 
general expression. The (least squares ht) coefficients ao,ai 

At low SNRs, we obtain a good result by increasing the maximum number 
of iterations. 

*^This property is invalid for other estimators, such as linear estimators 
[54], 

^^The optimal step size [53] is divided by s/2 in this study because the 
signal power of the real or imaginary part is 1/ y/2. 



(a) QPSK 



(b) 16QAM 


Fig. 6. SERs versus the normalized step size under the quantized MIMO 
system with a) QPSK and b) 16QAM constellations for a = 4, ^ = 10, 
/?t = 1. Markers con'espond to the lowest SER w.r.t. the normalized step size. 
The optimal step size determined by minimizing the distortion of a Gaussian 
signal [53], i.e., A = 0.7128, is plotted as the vertical axis. 
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TABLE II 

The coefficients ao and ai of Aopt(snrdB) for B = 2,3,4. 


B 

ao 

Ol 

2 

0.6921 

-0.0154 

3 

0.4364 

-0.0118 

4 

0.2559 

-0.0071 


are listed in Table II. The optimal step sizes determined by 
using Aopt are also indicated by the shaded area in Figure 6. 
We observe that, although Aopt is nonoptimal for each specific 
input, their corresponding performances remain acceptable. 
Following the same argument, we derive the corresponding 
polynomial equation Aopt(snrdB) for different quantization 
bits, with their coefficients listed in Table II. 

C. Effects due to the Absence of CSIR 

Comparing Figures 5(a) and 5(b), we observe that the loss 
due to no CSIR is small for the Bayes-optimal JCD estimator. 
Then, we discuss the performances of the Bayes-optimal JCD 
estimator with and without the perfect CSIR under various 
system settings to obtain a better understanding on the effects 
of CSIR over the quantized MIMO system. In contrast to 
the QPSK signals used in previous simulations, we focus on 
the Gaussian inputs, that is, ~ A/c(0,1), in the subse¬ 
quent experiments. The other system parameters are the same 
as that in the previous experiment, that is, a = N/K = 4, 
/3 = T/K = 10, and /3t = T^/K = 1. Figure 8 shows the 
asymptotic MSE mse^j for the Bayes-optimal JCD estimator 
with and without perfect CSIR. The MSE for the pilot- 
only scheme is also shown. Notably, the Bayes-optimal JCD 
estimator shows a significant improvement over the pilot-only 
scheme in the 1 bit and unquantized cases. The gap between 
the Bayes-optimal JCD estimator with and without perfect 
CSIR is small in the unquantized case, whereas the gap is 
large in the case of the 1 bit quantizer. By observing the 1 bit 
and unquantized cases, we can expect that the gap decreases 
with the increase in the ADC resolution. 

A straightforward method to reduce the gap of the 1 bit 
case is increasing the training length. We provide the MSE 
results for /3t = 5 and /3t = 9 in Eigure 9 to verify this 
argument. However, the improvement achieved by increasing 
the training length is limited even if /3t = 9, leaving only 
/3d = 1 for data. Alternatively, we may consider a larger /3, but 
the maximum /3 is limited by the coherence time. If /3 = 100 
and /3t = 1, then the performance of the Bayes-optimal JCD 
estimator without perfect CSIR is approximately similar to 
the (fundamental limit) case with perfect CSIR. Nevertheless, 
such a long coherence time is usually unavailable in practice. 

VI. Conclusion 

We developed a framework for studying the best possi¬ 
ble estimation performance of the quantized MIMO system, 
namely, the massive MIMO system with very low-precision 
ADCs. In particular, we used the Bayes-optimal inference for 
the JCD estimation and achieve this estimation by applying 
the BiG-AMP technique. The asymptotic performances (e.g.. 



Fig. 8. mse_)fj versus SNR for the pilot-only scheme and the Bayes-optimal 
ICD estimator with and without perfect CSIR under the 1-bit quantization and 
unquantized receivers. a = N/K = A, (} = TjK = 10, /3t = Tt/K = 1, 
and ~ A/c(0,1). 



Fig. 9. msexj versus SNR for the Bayes-optimal ICD estimator with 1-bit 
receivers under various setting of /3 and /3t. « = 4 and Xj ~ A/c(0i !)■ 


MSEs and SERs) w.r.t. the channels and the payload data 
were derived and shown as simply characterized by scalar 
AWGN channels. Monte-Carlo simulations were conducted to 
demonstrate the accuracy of our analytical results. 

The high accuracy of the analytical expressions enable 
us to quickly and efficiently assess the performance of the 
quantized MIMO system. Thus, we obtained the following 
useful guidelines for the system design; 

• We showed that the asymptotic MSE of the channel 
estimate in the conventional pilot-only scheme decreases 
by approximately 6dB for each bit added to the ADCs 
or each doubling of training length. This finding supports 
the importance of the JCD technique, especially in the 
quantized MIMO system. 

• The optimal step size for minimizing SERs of the Bayes- 
optimal estimator were shown to be highly different 
from that for minimizing the distortion of a Gaussian 
signal and, fortunately, can be quickly determined by our 
analytical expressions. 

• The Bayes-optimal estimator already exhibits the best 
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possible estimation performance. Even so, we showed 
that the performance gap between the Bayes-optimal JCD 
estimator with and without perfect CSIR still cannot 
be negligible in the quantized MIMO system. We also 
discussed the ways to reduce the gap and then concluded 
that achieving the same performance as the full CSIR 
case in the quantized MIMO system is very difficult. 

Many potential directions for future work are available. The 
GAMP-based JCD algorithm presented in this paper is a first 
step toward achieving the optimal JCD estimate under the 
quantized MIMO system. The computational complexity of 
the GAMP-based JCD algorithm may still be too high to be 
affordable in a commercial system. One possible solution is 
to adopt other suboptimal schemes such as linear estimators. 
Another feasible solution is using mixed-ADC receiver archi¬ 
tecture [12] wherein a small number of high-resolution ADCs 
is available. Thus, CSIR gains high accuracy and facilitates 
the JCD procedure. For a development in this direction, see 
[54]. 


where we have used d^{x)/dp = (t){x)dx/d'p. Using (62), 
(63) can be rearranged as 



Pout{Y\z)N{z]p,vP/2)<lz 


= pC + 


sign(y)ti^’ 

^2(0-2 + vP) 


(Hvi) - , 


(64) 


where 772 and Vi are given by (25). Multiplying both sizes by 
1/C, we obtain the marginal posterior mean given in (23). 


Similarly, (24) can be calculated by differentiating (62) 
twice as 



2pz p^ 1 \ 

{vP/2Y^ {vP/2Y vP/ 2 ) 

^ P out {X\z)N{z-,p, vP/2) dz 


-1 


(ct 2 -h vP)/2 




(65) 


which then can be rearranged as 


Appendix A; Derivations of (23) and (24) 

In this appendix, we derive the expressions (23) and (24), by 
applying the techniques in [33, Chapter 3.9]. The derivations 
below are only dedicated for the real part of the estimator 
because the imaginary part of the estimator can be obtained 
analogously. Note that the signal power and noise power are 
vP/2 and <7^/2, respectively, per real and imaginary part. For 
ease of notation, we have abused Y, Z, and p to denote Re(F), 
Ke{Z), and Re(p), respectively. 

To get (23), we begin by deriving the denominator of (17). 
First, recall from (8) that if F € and F < 0, the 

likelihood is given by 


E{Z21 p, vP/2} = 2p E{Z\p, vP/2} + /2 - f) 

~^ 2(ir+ 

We also note that 

\/ar{z\p,vP/2} = E{Z^\p,vP/2} - (E{z\p,vP/2}f . 

(67) 

After plugging (66) and (23) into (67)^ we obtain {2Y). In the 
above derivations, we have assumed F < 0. For F > 0, the 
above derivations can be used in the same way. 


Pout(U|Z) = $ 




n-i - z \ 


( 61 ) 


Appendix B; Proof of Proposition 1 


Note that for the special case & = 1, we have tq = — 00 , and 
the second term of (61) will disappear. Substituting (61) into 
the denominator of (17), it can be shown that'^ 


Pout {Y\z)M{z;p,vP/2) dz 

sign(F)p- |rb| ^ sign(F)p- |rj,_i| ^ 

+ V V{< + vP)/2 ) 

(62) 

Differentiating w.r.t. p on both sides of (62) yields 


^ sign(F) / / sign(F)j3- \rb\ '\ 

^{al+vP)/2 y[V«+v^)/‘2) 

_J sign(F)p- 

^{al+vP)/2 JJ’ 


(63) 


*^The calculation can be done by using the Gaussian reproduction property 
given by footnote 5. 


Using the replica method, we first compute the replicate par¬ 
tition function Ey {P’^(Y)} in (35), which with the definition 
of (11) can be expressed as 



where we define with and 

being the a-th replica of H and X, respectively, and AT = 
{X(“),Va} and n = {H(“),Va}. Here, (H(“),X(“)) are 
random matrices taken from the distribution (Ph,Px) for 
a = 0,1,..., T. In addition, J dY denotes the integral w.r.t. a 
discrete measure because the quantized output Y is a finite 
set. We will calculate the right-hand side of (68), by applying 
the techniques in [38,39,55] after additional manipulations. 

To average over (Ti, AT), we introduce two (t-I-I)x(t-I-I) 
matrices Qjf = and Qx„ = [Qx ] for o G {t, d} whose 
elements are defined by and = 

for t G To, where, is the nth row vector 
of and Xj“^ is the fth column vector of X(“) for f G 7t 
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or 7d- The definitions of Qh and Qxd are equivalent to 

N T 


1 = 


n n 5(hW(h(“))t-KQ^^)dg 


ab 


n—1 0<a<6 


1= / n n n 

oGjt.d} t^To 0<a<b 

where 6{-) denotes Dirac’s delta. Let Qx = {Qx^Vo} 
Z ^ {Z(“\Va}. Inserting the above into (68) yields 


-Y 

where 




Ey{P"(Y)}= (69) 


1 

lO 


n—10<a<b 


1 /dY fj Po,t (y 

a^O 

z(“)) 1 

fl <5(hW(h(“))t- 



y^V{Qx)^^x\ n n <)((xi“))tx('’)- 

o,tGTo 0<a<b 

Using the Fourier representation of the 6 function via auxiliary 
inatrices Qh = [Q#] € Qx = {Qx„ = 

[Qxq] € Vo} and performing the saddle point 

method for the integration over {Qh, Qx), we attain 

1 


^Ey{P^(Y)}=_ _Ext/ 


Qh ,Qx ^Qh Q: 


(70) 


with 
J-(r) A 


^ log Ez < fdYnt ]q Pout (Ynt 

yn,o,tGl~o ^ 

+ logMPiQn) - atr (QhQh^ 

+ T logM^iQx) - (Q^oQ 




(71a) 

(71b) 

(71c) 


where Extra;{/(x)} denotes the extreme value of f{x) w.r.t. x; 

M%\Qh) = In , 

mP{Qx) = Eat 

, the average free entr* 

{A.}, 


gtr(Qx„XfX„ 

^ oG{t,d} 

• Xo = [x^°l xi^^ • • • xi"”^ 

According to (35), the average free entropy turns out to be 

d 


F = lim — Extr 

^^0 dr Qh.QxAh,Q> 


assume that the saddle points follow the RS form [39] as 


Qh = (ch — qH)l + , (72a) 

Qh = {ch — Qh)^ + , (72b) 

Qxo = (cx„ — 7x„)I + 9x<,ll^, (72c) 

Qxo = (cx„ - 7xJI + ^Xoll"^- (72d) 


In addition, the application of the central limit theorem sug¬ 
gests that z„t = are Gaussian random 

vectors with (T + l)x(r+l) covariance matrix Qzf If 
t G To, then the (a, 6)th entry of Qz^ is given by 

)* z^^^ = Q|}g t = Qt ■ (73) 


As such, we set Qz^ = {chCx„ — <?ff<?x„)I + 7rr7Xol, which 
is equivalent to introducing to the Gaussian random variable 
z„i for t G To 

zif = ^/chCx, - qnqx, + x'qHqxFc, for a = 0,... T, 

(a) 

where uT and Vc are independent standard complex Gaus¬ 
sian random variables. With RS, the problem of seeking the 
extremum w.r.t. {Qh,Qxo,Qh,Qxo} 1® reduced to seek¬ 
ing the extremum over {cH,qH, cx„, qx,,CH,qH, cx,,qx,}, 
which can be obtained by equating the corresponding partial 
derivatives of the RS expression to zero. 

To this end, we calculate the RS expression of by 
substituting these RS expressions into (71a)-(71c). First, for 
(71a), we substitute (74) and perform the expectation w.r.t. Z 
and integration over Y„t, to yield 


(71o) = 2 q; ^ /3o log 

oG{t,d} 


^E„{T-b(K) (vk, (K))"} 


, &=i 


(75) 


where we define Vo = y/qnXoV and 

f 1 /•’’t _ (V2y-ZaG 'I 

^,{Vo) ^ eJ -= / dye (76) 

with Zo = ^/cHXa — qnqx^u + 14, and u and v being 
independent real standard Gaussian random variables.'^ Per¬ 
forming the expectation w.r.t. u, (76) can be expressed as (45). 
Next, we move to the RS calculation of (71b). Under the RS 
assumption, the first term of (71b) can be written as 




logE^I fl 


Then we decouple the first quadratic term in the exponent us¬ 
ing the Hubbard-Stratonovich transformation and introducing 



The saddle points of can be found by the point of zero 
gradient w.r.t. {Q//, Qh^ Qx,,} but it is still prohibitive 
to get explicit expressions about the saddle points. Thus, we 


^^Note that and Vc in (74) are standard “complex” Gaussian random 
variables. In this paper, we process the real and imaginary parts separately. 
Therefore, for ease of notation, we have rescaled all the observation outputs 
ynj and by so that the real and imaginary parts of these random 
variables can be regarded as the standard “real” Gaussian random variables. 
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the auxiliary vector yH,n G to rewrite (77) as 

I n=l 

= alog Eh| J 

(78) 

With RS, the second term of (71b) can be expressed as 
-a(^{cH +TqH){cH + TqH) +t{ch - qH){cH - qnij ■ (79) 
Similarly, for the first and second terms of (71b), we have 



for arbitrary non-negative integers jr^, ji^, ii^, jr^, 

Jix ’ Jix ■ To proceed, we define 


n,k 

k,t 

X (83) 


with OR, ai G t}, ur ^ ai and 6r,5i G r}, 

bn ^ bi. If we define the generalized free entropy as 


^ = -4 lim 




r-i.o+ dshde. 


In E,r 


^^hfh^xfx P'E 


(Y)} 


£)i=0,£x=0 

(84) 


^ /3ologExJ 

oe{t,d} I 

X ^Ex' |e^'s/7^7R®(Xx„^o)+(cxo-9Xo)|^'P 

- ^ Po(^icx, + rqx,) (cx„ + rqx,) 

oG{t,d} 

+ t{cx, - qxj{cx^ - 7xj) ■ (80) 

Putting together (78)-(80), we have the RS expression 
of jr(T) The parameters {cH,qH, cx„, ,CH,qH, cx„, qx,} 
are determined by setting the partial derivatives of to 
zeros. In doing so, as r —> 0, it is easy to get that ch = 0, 
cx„ = 0, Ch = E{|i7p}, and cx„ = E{|Xop}. In order to 
obtain the more meaningful expressions for the other param¬ 
eters, we introduce two scalar AWGN channels given in (36) 
and their associated parameters in Section IV-A. Equating the 
partial derivatives of w.r.t. {qH,qXaTQH,qXa} to zeros, 
we obtain the fixed-point equations given in (46). Finally, 
taking the partial derivatives of at t = 0, and applying 
the parameters introduced in Section IV-A, we obtain (44). 



Appendix C: Proof of Proposition 2 

Consider the (n, fc)-th and (fc,f)-th entries of H and Xd, 
respectively. We will show that the joint moments of the 
joint distribution of (i7„fc, Xd,fctj Hnk, for some indices 

(n, k) and (fc, t) converges to the joint distribution of 

P(J7)P(Yff|J7)P(i7|Yff)P(Xd)P(rxjXd)P(Xd|yxJ, (81) 

independent of (n, fc) and {k,t). Following [43], we proceed 
to calculate the joint moments 

Im(i7„fc)*'-. Re(i7„fc)'‘^'' 

Re(Xd,fct)*‘^- Im(Xd,fei)*^- Re(Xd,fei)'''^“= Im(Xd,fct)''“=} ( 82 ) 


it exactly provides the joint moments of interest. 

As £?i = 0 and = 0, EY{e'^'*7'*®“=7“’P'^(Y)} reduces to 
Ey{P’^(Y)} given in (68). Therefore, proceeding with the 
same steps as in Appendix B from (68) to (70), we get 

^^ Le^he^Upr(Y)] = Extr f|, (85) 

f ^ Qh,QxAh,Qx '' ’ 

where is exactly identical to (71) while and 

■Ylx^(Qx) should be replaced by 

M%\Qh) = “") |, 

M4iQx) = Bxle^-^^ [] gtr(Qx„X-X„)| 

I oe{t,d} J 

Thus, except for and (Qx), the RS ex¬ 

pressions for the other parts of can be obtained as in 
Appendix B. We now only need to obtain the RS expressions 
for log7Vf^^(Qrr) and log7VfY^(Qx)- The generalized free 
energy in (84) becomes 

P = J dYndYx, Eff{(Re(i7))*^'*(Im(i7))*"-P(Yff|i7)} 

X EH{{Re{H)y^^ (Im(i7))^''. P{H\Yh)} 

" -V-^ 

ne(Hy^h 

X ExJ(Re(Xd))*^“=(Im(Xd))*^“=P(YxjXd)} 

X Ex,{(Re(Xd(Im(Xd)P(Xd | Yx,)} 

'-y-^ 

Re(Xd)*^a; Im(Xcl)^^a: 

( 86 ) 

which is the joint moments of (77, Xd, 77, Xd). Consequently, 
the joint moment of interest is thus uniquely determined by 
(86) due to the Carleman theorem. 

Appendix D: Proof of Proposition 3 
In this derivation, we consider the case at infinity SNR, 
i.e. CT^, = 0, and we let cr^ = 1 and = 1 without loss 
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of generality. From (56), as j5x. —>■ oo, we have qn —>■ oo. 
An application of the Taylor expansion yields 1 — mse// = 
(1 + « 1 — llqn, and thus we have 


mse// « l/qn- (87) 

Let u = ■ We then evaluate Xt in (58) by 

changing the integration variable from v to u, which yields 


Xt = 


cb 


^mse//(l - mse^)’ 


where 


Cb 


= E 

6=1 * 


g 2(l-msejf) 


X 


{<P{z) - (t>{z - 


'/2{rh-ri-i) 

VmseH 


$(z) - - 




VmseJF 

As mse// —)■ 0, cb can be approximated by 

2® „ __2 


1 


Cb 


(27r)3/2 


E' 

6=1 


^{z 


-dz, 


( 88 ) 


-dz. (89) 


(90) 


which is a quantizer-dependent constant. Using q^ = PtXt 
given in (58) and combining (87) and (88), we obtain 
mse// « (/3tCB)“^ or (59) in dB scale, wherein Cb = 
—20 logiQ(cB). The values of Cb in Table I are obtained from 
(89) numerically. 


where 


d>biVo) = $ 


V2n - K 


\/^w + {CH^C-Xa - 9fffc9Xd,fc) / 

_ '/^rb-i - Vq 

\/^w + {cHkCXa - qn^qxo 


- $ 


(93) 


Vo = x^(qH,qx^,Jv for o e {t, d}; 
is the mutual information between and 

I(^d,kjVx^ k) i® mutual information between 

Yxd.fc and Xd,k; and cx„ = cr^„, ch^ = cr^^- In (92), the 
other parameters {gx,,,*,, gr/k, 9/6^} are obtained from 

the solutions to the fixed-point equations 


qH;, = A.fcgx,.,Xt + /?dgXd.;,Xci, = ch^ - mse^/,, 


qx,,k = otqHkXti 

qx^,k = otqHkXd, 


(94a) 

gx.,;, = cx. - msex,.,, (94b) 
gxd.fc = cxd - msexd.,, (94c) 


where msex, ^ = 0, and mse//^ and msexj are the MSEs 
of the Bayes-optimal estimators over (91b) and (91a), respec¬ 
tively. In (46), we have defined 


- E / 


6=1 ' 


(^6 {V^XUhdlxZ^v) 

^6 (\/ (gfffcgx„,fc)c. 


with Tifc)-) given by (93) and %{Vo) = 


for o G {t, d} 

(95) 


Appendix E: A Generalization of Proposition 1 

In this Appendix, we extend Proposition 1 into the case 
where users have different large-scale fading factors . This 
task can be performed by proceeding with the same steps as 
in Appendix A, and the proof is omitted. 

Similar to (36), we define the scalar AWGN channels for 
this case: 

Ixd.fc = ^gXd.fcTfd.fe + ITxd,fc, (91a) 

Yh, = V^kHk + Wn,, (91b) 

where lUxd,^,!!^^ ~ A/c(0,1), Xd,k ^ Pxj, and Hk ~ 
Ph^ = A/c(0,ct^^). Eor ease of notation, we use (ofc) = 
^ represent the average over a set {ak '■ k = 

Proposition 4: As AT —> oo, the asymptotic free entropy is 

E ntf Dn ^6(Uo)logT'b(Uo) J 

oe{t,d} \6=i“^ / 

- a{I{Hk;YHjqHj) - /3d(7(2fd,/c; Exd_ Jgxd.^)) 

+ a{{cH^ - qHk)qHk) + E /5o((cx<, - gx„,fc)gXo,fc), 

oG{t, d} 

( 92 ) 
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